AIBase
AI News

Pipeshift Launches Modular Inference Engine, Reducing AI Inference GPU Utilization by 75%


New Transformer Acceleration Technique FlashAttention-3 Released, Costs Plummet

The Transformer acceleration technique FlashAttention-3 has been released, promising faster Large Language Model (LLM) inference at lower cost. Compared with its predecessors, FlashAttention-3 delivers significantly higher GPU utilization: training and running large language models is reported to be 1.5 to 2 times faster.
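The core idea behind the FlashAttention family is to compute attention blockwise with an online softmax, so the full N×N score matrix never has to be materialized in slow memory. Below is a minimal NumPy sketch of that blockwise rescaling trick, for illustration only: the real FlashAttention-3 kernels are fused CUDA code exploiting Hopper hardware features, and the function names here are invented for this example.

```python
import numpy as np

def naive_attention(q, k, v):
    # Reference: softmax(q @ k.T / sqrt(d)) @ v, materializing all scores.
    d = q.shape[-1]
    s = q @ k.T / np.sqrt(d)
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    return p @ v

def tiled_attention(q, k, v, block=4):
    # FlashAttention-style loop: visit K/V in blocks, keeping a running
    # row max (m) and softmax denominator (l) so only one block of
    # scores exists at a time.
    n, d = q.shape
    o = np.zeros_like(q, dtype=float)
    m = np.full(n, -np.inf)            # running row max
    l = np.zeros(n)                    # running softmax denominator
    for j in range(0, k.shape[0], block):
        kj, vj = k[j:j + block], v[j:j + block]
        s = q @ kj.T / np.sqrt(d)      # scores for this block only
        m_new = np.maximum(m, s.max(axis=-1))
        p = np.exp(s - m_new[:, None])
        scale = np.exp(m - m_new)      # rescale earlier partial sums
        l = l * scale + p.sum(axis=-1)
        o = o * scale[:, None] + p @ vj
        m = m_new
    return o / l[:, None]
```

Both functions return the same result; the tiled version simply trades one large intermediate for a loop of small ones, which is where the speed and memory savings on real GPUs come from.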


ByteDance Joins Forces with Peking University to Create MegaScale: A Single 'Ten-Thousand Card Cluster' for Training LLMs

MegaScale built a single cluster of more than 10,000 GPUs and achieved a model FLOP utilization (MFU) of 55.2%. It also includes diagnostic tools for monitoring system components and events.
